A Shape Based Post Processor for Gurmukhi OCR

نویسندگان

  • Gurpreet Singh Lehal
  • Chandan Singh
  • Ritu Lehal
چکیده

A shape based post processing system for an OCR of Gurmukhi script has been developed. Based on the size and shape of a word, the Punjabi corpora has been split into different partitions. The statistical information of Punjabi language syllable combination, corpora look up and holistic recognition of most commonly occurring words have been combined to design the post processor. An improvement of 3% in recognition rate from 94.35% to 97.34% has been reported on machine printed images using the post processing techniques.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A post-processor for Gurmukhi OCR

A post-processing system for OCR of Gurmukhi script has been developed. Statistical information of Punjabi language syllable combinations, corpora look-up and certain heuristics based on Punjabi grammar rules have been combined to design the post-processor. An improvement of 3% in recognition rate, from 94.35% to 97.34%, has been reported on clean images using the post-processing techniques.

متن کامل

A Complete Machine printed Gurmukhi OCR System

Recognition of Indian language scripts is a challenging problem. Work for the development of complete OCR systems for Indian language scripts is still in infancy. Complete OCR systems have recently been developed for Devanagri and Bangla scripts. Research in the field of recognition of Gurmukhi script faces major problems mainly related to the unique characteristics of the script like connectiv...

متن کامل

A Hybrid Approach to Classify Gurmukhi Script Characters

Researchers have worked extensively on OCR, in the past few decades. This is also visible from the fact that various types of OCR are available in the market. Out of these available OCR’s majority is to support foreign languages. In Indian context, majority of available OCR’s are for Hindi and Bangla, but a very few reports are available on Gurmukhi script which is used to write Punjabi languag...

متن کامل

Segmentation of Broken Characters of Handwritten Gurmukhi Script

Character Segmentation of Handwritten Documents has been an active area of research and due to its diverse applicable environment; it continues to be a challenging research topic. The desire to edit scanned text document forces the researchers to think about the optical character recognition (OCR). OCR is the process of recognizing a segmented part of the scanned image as a character. OCR proce...

متن کامل

Feature Extraction and Classification Techniques in O.C.R. Systems for Handwritten Gurmukhi Script – A Survey

Optical character recognition (OCR) is very popular research field since 1950’s. A great work has been done for various scripts particularly in case of English. But in case of Indian scripts the research is limited. This paper presents an overview of the various O.C.R. systems for gurmukhi which are developed for handwritten isolated gurmukhi text. In case of printed gurmukhi text a lot of rese...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2001